Primordial non-Gaussianity from the covariance of galaxy cluster counts 
It has recently been proposed that the large-scale bias of dark matter halos depends sensitively
on primordial non-Gaussianity of the local form. In this paper we point out that the strong scale
dependence of the non-Gaussian halo bias imprints a distinct signature on the covariance of cluster
counts. We find that using the full covariance of cluster counts improves constraints on the
non-Gaussian parameter f_NL by three (one) orders of magnitude relative to constraints from cluster
counts (counts + clustering variance) alone. We forecast f_NL constraints for the upcoming
Dark Energy Survey in the presence of uncertainties in the mass-observable relation, halo bias, and
photometric redshifts. We find that the DES can yield constraints on non-Gaussianity of
σ(f_NL) ~ 1-5 even for relatively conservative assumptions regarding systematics. An excess of
correlations of cluster counts on scales of hundreds of megaparsecs would represent a smoking-gun
signature of primordial non-Gaussianity of the local type.



I. INTRODUCTION 

Primordial non-Gaussianity provides cosmology with one of the precious few connections
between primordial physics and the present-day universe. Standard inflationary theory with a
single, slowly rolling scalar field predicts that the spatial distribution of structures in
the universe today is very nearly Gaussian random (e.g. [1-5]; for an excellent recent review,
see [6]). Departures from Gaussianity, barring contamination from systematic errors or
late-time non-Gaussianity due to secondary processes, can therefore be interpreted as a
violation of this "vanilla" inflationary assumption. Constraining or detecting primordial
non-Gaussianity is therefore an important and basic test of the cosmological model.

Constraints on primordial non-Gaussianity have traditionally been obtained from observations
of the cosmic microwave background, as nonzero non-Gaussianity generates a nonzero three-point
correlation function (or its Fourier transform, the bispectrum) of density fluctuations [7-13].
Increasingly sophisticated algorithms have been developed to constrain non-Gaussianity [14, 15]
and, to the extent that it can be measured, Gaussianity has so far been confirmed [16-21]. For
example, the most recent constraints from the Wilkinson Microwave Anisotropy Probe (WMAP)
indicate f_NL ≈ 32 ± 21 (1σ; [22]), where the exact constraints depend somewhat on the choice of
the statistical estimator applied to the data, the CMB map used, and details of the foreground
subtraction. Here f_NL is the parameter describing non-Gaussianity in the widely studied "local"
model, where the non-Gaussian potential Φ_NG is defined by

$$\Phi_{\rm NG}(x) = \Phi_{\rm G}(x) + f_{\rm NL}\left(\Phi_{\rm G}^2(x) - \langle\Phi_{\rm G}^2\rangle\right), \qquad (1)$$

and where Φ_G is the Gaussian potential. Corresponding constraints can be obtained on other
classes of non-Gaussian models. For example, for "equilateral" models where most power comes
from equilateral triangle configurations, f_NL = 26 ± 140 (1σ; ^).
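As a minimal illustration of the local model, Φ_NG = Φ_G + f_NL (Φ_G² − ⟨Φ_G²⟩), the sketch below applies the quadratic transformation to a toy Gaussian field and compares the sample skewness before and after. The unit-variance white-noise field, the value f_NL = 0.1, and the function names are illustrative assumptions only; a realistic primordial potential has an amplitude of order 10^-5 and a nearly scale-invariant spectrum.

```python
import numpy as np

def local_ng_field(phi_g, f_nl):
    """Local-model transformation: phi_g + f_nl * (phi_g^2 - <phi_g^2>)."""
    # Subtracting the mean of phi_g**2 leaves the mean of the field unchanged.
    return phi_g + f_nl * (phi_g**2 - np.mean(phi_g**2))

def skewness(x):
    """Sample skewness, which vanishes on average for a Gaussian field."""
    d = x - x.mean()
    return np.mean(d**3) / np.mean(d**2) ** 1.5

rng = np.random.default_rng(0)
# Toy Gaussian field: unit-variance white noise (illustrative assumption only).
phi_g = rng.standard_normal((64, 64))
phi_ng = local_ng_field(phi_g, f_nl=0.1)

print("skewness, Gaussian field:    ", skewness(phi_g))
print("skewness, non-Gaussian field:", skewness(phi_ng))
```

A positive f_NL skews the toy field toward positive values, which is the qualitative effect that makes the high tail of the distribution (and hence halo statistics) sensitive to f_NL.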



The CMB is not the only cosmological probe sensitive to primordial non-Gaussianity. It has
been known for a relatively long time that the abundance of dark matter halos [25-29] (or
voids [30, 31]) is sensitive to the presence of primordial non-Gaussianity. This dependence is
easy to understand: halos populate the high tail of the probability density distribution of
structures in the universe, and the shape of this distribution is sensitive to departures from
Gaussianity. However, while the halo abundance is rather powerful in constraining models that
are non-Gaussian in the density (rather than the potential) [32], for the popular models of
the local type (cf. Eq. (1)) the abundance is much less constraining than, and not competitive
with, the CMB anisotropy (e.g. [33, 34]).

Some of us [35] have recently shown that the clustering of dark matter halos is very sensitive
to primordial non-Gaussianity of the local type. This exciting development paves the way to
using large-scale structure to probe primordial non-Gaussianity nearly three orders of
magnitude more accurately than using the abundance of halos. Dalal et al. [35] found,
analytically and numerically, that the bias of dark matter halos acquires a strong scale
dependence,

$$b(k) = b_0 + f_{\rm NL}\,(b_0 - 1)\,\delta_c\,\frac{3\,\Omega_M H_0^2}{a\,g(a)\,T(k)\,c^2 k^2}, \qquad (2)$$

where b_0 is the usual Gaussian bias (on large scales, where it is constant), δ_c ≈ 1.686 is the
collapse threshold, a is the scale factor, Ω_M is the matter fraction relative to critical, H_0
is the Hubble constant, k is the wavenumber, T(k) is the transfer function, and g(a) is the
growth suppression factor^1. This result has been confirmed by other researchers using a variety
of methods, including the peak-background split [36-38], perturbation theory [39-41], and
numerical (N-body) simulations [42, 43]. Astrophysical measurements of the scale dependence of
the large-scale bias, using galaxy and quasar clustering as well as the cross-correlation
between the galaxy density and CMB anisotropy, have recently been used to impose constraints on
f_NL already comparable to those from the cosmic microwave background (CMB) anisotropy [36-38],
giving f_NL = 28 ± 23 (1σ), with some dependence on the assumptions made in the analysis [38].
In the future, constraints on f_NL are expected to be of order a few [33, 35, 44]. The
sensitivity of the large-scale bias to other models of primordial non-Gaussianity has not been
investigated yet (though see preliminary analyses in imilSj).

^1 The usual linear growth D(a), normalized to be equal to a in the matter-dominated epoch, is
related to the suppression factor g(a) via D(a) = a g(a).
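To give a feeling for the size of the scale-dependent bias, the sketch below evaluates a correction of the form Δb(k) = f_NL (b_0 − 1) δ_c 3 Ω_M H_0² / (a g(a) T(k) c² k²). This expression is the form commonly quoted for the result of Dalal et al. and should be treated as an assumption of the sketch, as should the large-scale limit T(k) = 1, the illustrative value g = 0.76, and Ω_M = 0.258 (a flat universe with dark energy density 0.742). Wavenumbers are in h/Mpc.

```python
import numpy as np

C_KM_S = 299792.458            # speed of light [km/s]
H0_OVER_C = 100.0 / C_KM_S     # H_0/c in h/Mpc, for H_0 = 100 h km/s/Mpc
DELTA_C = 1.686                # collapse threshold

def delta_b(k, f_nl, b0, omega_m=0.258, a=1.0, g=0.76, transfer=1.0):
    """Assumed scale-dependent bias correction for wavenumber k in h/Mpc.

    transfer=1 is the large-scale limit; g is an illustrative value of the
    growth suppression factor, not a number taken from the paper.
    """
    amplitude = f_nl * (b0 - 1.0) * DELTA_C * 3.0 * omega_m * H0_OVER_C**2
    return amplitude / (a * g * transfer * k**2)

k = np.array([1e-3, 1e-2, 1e-1])
for ki, db in zip(k, delta_b(k, f_nl=10.0, b0=3.0)):
    print(f"k = {ki:.0e} h/Mpc  ->  delta_b = {db:.5f}")
```

The 1/k² behavior means the correction grows by a factor of 100 for every decade toward larger scales, which is why widely separated pixels carry most of the f_NL signal.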

Clustering of galaxy clusters, in particular, can very strongly constrain primordial
non-Gaussianity. Clusters have the advantage of being large, relatively simple objects that
are easy to find using either optical or X-ray light, or else from their Sunyaev-Zeldovich
signature. Clusters already provide interesting constraints on dark energy [IHl EO] and they
hold promise for precision measurements of cosmological and dark energy parameters (e.g. [51]).
Since clusters are massive and hence significantly biased objects, their counts (via the mass
function) and clustering (via the mass function and bias) are both sensitive to primordial
non-Gaussianity. Recently, Oguri [52] has argued that the variance of cluster counts (i.e. the
scatter measured in each cell individually), in combination with the cluster counts, leads to
interesting improvements on f_NL constraints relative to the counts-only case.

In this paper we point out that including the covariance of cluster counts in angle and
redshift leads to very significant further improvements in the cluster constraints on local
primordial non-Gaussianity. The principal reason for the improvement is simple: the covariance
is determined by the cluster power spectrum, which is proportional to the halo bias squared.
At large scales, the non-Gaussian contribution to the halo bias dominates (cf. Eq. (2)), and
this results in a strong f_NL signal in the covariance. Furthermore, we explore the sensitivity
of the constraints to various assumptions about statistical and systematic errors in modeling
the cluster mass-observable relation, as well as the presence of other cosmological parameters.
We find that the bulk of the information about local non-Gaussianity comes from the
far-separation covariances of cluster counts-in-cells.

This paper is organized as follows. In Sec. II we describe the methodology that we use to
obtain constraints from both counts and clustering of galaxy clusters. In Sec. III we describe
our fiducial assumptions about the cosmological model and data as well as solutions to various
challenges in calculating the constraints. In Sec. IV we describe the forecasted constraints
on f_NL from the Dark Energy Survey. We discuss our results in Sec. V and conclude in Sec. VI.



II. METHODOLOGY 

We address the following problem: how well can the cosmological parameters be recovered using
counts of galaxy clusters in pixels distributed in angle and radius on the sky? We largely
follow the formalism of Hu and Cohn [53] and Lima and Hu [54].

Assume that clusters are counted in square pixels of fixed angular size θ_pix, corresponding to
comoving size L_pix(z) = θ_pix r(z), where r is the comoving distance. The clusters are also
binned in the mass-observable (i.e. the observable proxy for cluster mass), with intervals
[M_obs^α, M_obs^{α+1}], where α refers to a specific mass-observable bin. The number density of
clusters at a given redshift z with observable in the range M_obs^α ≤ M_obs < M_obs^{α+1} is
given by

$$\bar n_\alpha(z) = \int_{M_{\rm obs}^{\alpha}}^{M_{\rm obs}^{\alpha+1}} \frac{dM_{\rm obs}}{M_{\rm obs}} \int \frac{dM}{M}\,\frac{dn}{d\ln M}\,p(M_{\rm obs}|M), \qquad (3)$$

where p(M_obs|M) is the observable-mass relation (explained in Appendix A) and dn/dlnM is the
mass function. Uncertainties in the redshifts distort the volume element; we fully take into
account the photometric redshift uncertainties following [55]; details are shown in Appendix B.

We adopt the mass function from Dalal et al. [35], who used N-body simulations to parametrize
the shift in mass of a typical halo in the presence of non-Gaussianity. The mass shift,
M_G → M, is adequately described by a Gaussian with mean and variance respectively given by

$$\left\langle \frac{M}{M_G} \right\rangle - 1 = 1.3\times10^{-4}\, f_{\rm NL}\,\sigma_8\,\sigma(M_G,z)^{-2}, \qquad (4)$$

$${\rm var}\left(\frac{M}{M_G}\right) = 1.4\times10^{-4}\,(f_{\rm NL}\,\sigma_8)^{0.8}\,\sigma(M_G,z)^{-1}, \qquad (5)$$

where σ(M, z) is the amplitude of mass fluctuations on mass scale M and at redshift z. The
final non-Gaussian mass function is given by [35]

$$\frac{dn}{dM} = \int dM_G\,\frac{dn}{dM_G}\,\frac{dP}{dM}(M_G), \qquad (6)$$

where dP/dM(M_G) is the probability distribution that a Gaussian halo of mass M_G maps to a
non-Gaussian halo of mass M, and is given by the Gaussian with the mean and variance given in
Eqs. (4) and (5). For dn/dM_G, we adopt the Jenkins mass function ^Of.
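The mass-shift prescription can be sketched numerically as follows. The code assumes the fitting forms ⟨M/M_G⟩ − 1 = 1.3×10^-4 f_NL σ_8 σ(M_G,z)^-2 and var(M/M_G) = 1.4×10^-4 (f_NL σ_8)^0.8 σ(M_G,z)^-1, is written for positive f_NL only, and uses an illustrative value σ(M_G, z) = 0.6 and hypothetical function names; it builds the Gaussian kernel dP/dM and checks that it is normalized.

```python
import numpy as np

def mass_shift_moments(f_nl, sigma8, sigma_m):
    """Mean and variance of M/M_G for f_nl > 0 (assumed fitting forms)."""
    mean = 1.0 + 1.3e-4 * f_nl * sigma8 / sigma_m**2
    var = 1.4e-4 * (f_nl * sigma8) ** 0.8 / sigma_m
    return mean, var

def shift_kernel(m, m_g, f_nl, sigma8, sigma_m):
    """Gaussian dP/dM that maps a Gaussian-case mass m_g to a shifted mass m."""
    mean, var = mass_shift_moments(f_nl, sigma8, sigma_m)
    mu = mean * m_g
    s = np.sqrt(var) * m_g
    return np.exp(-0.5 * ((m - mu) / s) ** 2) / (np.sqrt(2.0 * np.pi) * s)

# Illustrative inputs: f_nl = 50, sigma8 = 0.796, sigma(M_G, z) = 0.6.
m_g = 1.0e14
m = np.linspace(0.5 * m_g, 1.5 * m_g, 4001)
kernel = shift_kernel(m, m_g, 50.0, 0.796, 0.6)
norm = np.sum(kernel) * (m[1] - m[0])     # simple Riemann sum over the grid

mean, var = mass_shift_moments(50.0, 0.796, 0.6)
print("mean shift:", mean - 1.0, " variance:", var, " kernel norm:", norm)
```

Convolving a Gaussian-case mass function with this kernel over M_G would then give the non-Gaussian mass function; the percent-level mean shift for these inputs illustrates why counts alone constrain f_NL only weakly.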

On large scales, the number counts of clusters m(x) trace the linear density perturbation δ(x),

$$m_i(M_\alpha, \mathbf{x}) \equiv m_{i\alpha} = \bar m_{i\alpha}\left(1 + b(M_\alpha, z)\,\delta(\mathbf{x})\right), \qquad (7)$$

where i refers to the pixel (i.e. its angular and radial coordinates), and α indicates the mass
bin. The spatial covariance of cluster counts is [ST]

$$S_{ij}^\alpha = \langle (m_{i\alpha} - \bar m_{i\alpha})(m_{j\alpha} - \bar m_{j\alpha}) \rangle = \bar m_{i\alpha}\,\bar m_{j\alpha}\,\xi_{ij}^\alpha, \qquad (8)$$

where ξ_ij^α is the pixel real-space correlation function

$$\xi_{ij}^\alpha = \int \frac{d^3k}{(2\pi)^3}\,|W_i(\mathbf{k})\,W_j(\mathbf{k})|\,\cos(k_x \Delta x_{ij})\,\cos(k_y \Delta y_{ij})\,\cos(k_z \Delta z_{ij})\,b_{i\alpha}\,b_{j\alpha}\,P(k; z). \qquad (9)$$

If i and j come from different redshift bins, the geometric mean of the two redshifts^2 is
adopted for z. In the limit r_ij ≫ L_pix, ξ_ij → ξ(r_ij), where the latter quantity is the
standard two-point correlation function in real space. Δx_ij = L_pix n_xij is the physical
separation between i and j in the x direction (transverse to the line of sight), and n_xij is
the number of pixels separating them; Δy_ij is defined equivalently. Finally, the window
function W is the Fourier transform of the square pixel in the presence of redshift errors,

$$W_i(\mathbf{k}) = \exp\left[-\frac{\sigma_{z,i}^2 k_z^2}{2H_i^2}\right] j_0\!\left(\frac{k_x L_{\rm pix}}{2}\right) j_0\!\left(\frac{k_y L_{\rm pix}}{2}\right) j_0\!\left(\frac{k_z \Delta z}{2H_i}\right), \qquad (10)$$

where the index i refers to the redshift bin, σ_{z,i} is the redshift scatter at the radial
distance corresponding to the ith pixel, and H_i is the Hubble parameter. The photo-z bias is
implicit in the Δz_ij term in Eq. (9).
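A pixel window of this kind can be sketched in a few lines. The code assumes the form W = exp[−σ_z² k_z² / (2H²)] j_0(k_x L_pix/2) j_0(k_y L_pix/2) j_0(k_z Δz/(2H)), with H expressed as H(z)/c in h/Mpc so that Δz/H is a comoving length; the numerical values (L_pix = 150 h^-1 Mpc, H/c = 4.2×10^-4 h/Mpc) are illustrative assumptions, not the paper's inputs.

```python
import numpy as np

def j0(x):
    """Spherical Bessel function j0(x) = sin(x)/x, with j0(0) = 1."""
    # np.sinc(t) is sin(pi t)/(pi t), so rescale the argument by pi.
    return np.sinc(np.asarray(x, dtype=float) / np.pi)

def pixel_window(kx, ky, kz, l_pix, delta_z, sigma_z, hubble):
    """Assumed window of a square pixel with Gaussian redshift scatter.

    k components are in h/Mpc, l_pix in Mpc/h, and hubble is H(z)/c in h/Mpc.
    """
    damping = np.exp(-(sigma_z * kz) ** 2 / (2.0 * hubble**2))
    return (damping
            * j0(kx * l_pix / 2.0)
            * j0(ky * l_pix / 2.0)
            * j0(kz * delta_z / (2.0 * hubble)))

# Illustrative numbers only.
L_PIX, DELTA_Z, HUBBLE = 150.0, 0.2, 4.2e-4
w_lo = pixel_window(0.0, 0.0, 0.005, L_PIX, DELTA_Z, 0.02, HUBBLE)
w_hi = pixel_window(0.0, 0.0, 0.005, L_PIX, DELTA_Z, 0.05, HUBBLE)
print("W at kz = 0.005 h/Mpc, sigma_z = 0.02:", w_lo)
print("W at kz = 0.005 h/Mpc, sigma_z = 0.05:", w_hi)
```

Larger photo-z scatter suppresses radial modes more strongly, while the transverse factors depend only on the pixel side.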

The expression for the full Fisher matrix for galaxy cluster counts and their covariance is
quite complicated (see [53]), but a reasonable approximation is given by [58]

$$F_{\mu\nu} = \bar{\mathbf m}^{T}_{,\mu}\,\mathbf C^{-1}\,\bar{\mathbf m}_{,\nu} + \frac{1}{2}\,{\rm Tr}\left[\mathbf C^{-1}\,\mathbf S_{,\mu}\,\mathbf C^{-1}\,\mathbf S_{,\nu}\right], \qquad (11)$$

where the first term encodes information from cluster counts, and the second from the
covariance. Here μ and ν are indices that refer to both cosmological and nuisance parameters
(including f_NL). The cluster counts have been arranged as the vector m̄. S = {S_ij^α} is the
sample covariance matrix from Eq. (8), and C = N + S is the total covariance. N_ij = m̄_i δ_ij
is the (shot) noise matrix. The derivative with respect to f_NL can be computed analytically,
using the fact that P(k, z) ∝ b²(k, z) and Eq. (2).
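The structure of such a Fisher matrix, a counts term plus a trace term built from derivatives of the sample covariance, can be sketched as below. The code assumes the approximation F_μν = m̄,μ^T C^-1 m̄,ν + (1/2) Tr[C^-1 S,μ C^-1 S,ν] with C = N + S and N = diag(m̄); the function name and the two-pixel toy numbers are hypothetical.

```python
import numpy as np

def fisher_matrix(mbar, dmbar, S, dS):
    """Counts-plus-covariance Fisher matrix (assumed approximate form).

    mbar  : (n,)       mean counts per pixel
    dmbar : (p, n)     derivatives of mbar with respect to p parameters
    S     : (n, n)     sample covariance
    dS    : (p, n, n)  derivatives of S with respect to the parameters
    """
    C = np.diag(mbar) + S              # total covariance: shot noise + sample
    Cinv = np.linalg.inv(C)
    p = dmbar.shape[0]
    F = np.zeros((p, p))
    for mu in range(p):
        for nu in range(p):
            counts = dmbar[mu] @ Cinv @ dmbar[nu]
            clustering = 0.5 * np.trace(Cinv @ dS[mu] @ Cinv @ dS[nu])
            F[mu, nu] = counts + clustering
    return F

# Toy example: two pixels, one parameter that changes only their covariance.
mbar = np.array([100.0, 100.0])
S = np.array([[4.0, 1.0], [1.0, 4.0]])
dmbar = np.array([[1.0, 1.0]])
dS = np.array([[[0.0, 1.0], [1.0, 0.0]]])
F = fisher_matrix(mbar, dmbar, S, dS)
print("Fisher matrix:", F)
print("unmarginalized error:", 1.0 / np.sqrt(F[0, 0]))
```

In this toy setup the parameter enters the trace term only through the off-diagonal element, mimicking how f_NL information is carried by the pixel-pixel covariance.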



III. FIDUCIAL ASSUMPTIONS AND 
CALCULATIONAL CHALLENGES 

We implement the procedure outlined above for opti- 
cally selected clusters. In our fiducial setup we divide the 



^2 In the linear regime, the correlation between pixels i and j contains the product of the
growth factors corresponding to z_i and z_j. Therefore, the corresponding power spectrum,
P(k, z) in Eq. (9), should use the growth function equal to the geometric mean of the two
growth functions. Instead, we effectively use the growth function evaluated at the redshift
equal to the geometric mean of the two redshifts z_i and z_j. Results are insensitive to this
approximation, especially because most of the information comes from relatively close
redshift pairs.



sky into an 11 x 11 field of pixels of 41.32 sq. deg. each, for a total of 5,000 sq. deg.,
which matches expectations for the Dark Energy Survey (DES). The facing surface of each pixel
is a square with a side L_pix(z) = θ_pix r(z) (see Sec. II). Each pixel has redshift depth
Δz = 0.2, and we assume a maximum redshift of 1.0 so that there are five redshift bins. We
adopt the mass threshold of M"^ = lO^'^'^/i~^M0 and also bin in mass, using 5 mass bins of
width Δ ln M^th = 0.2, with the exception of the highest-mass bin, which we extend to infinity.
Using smaller bins in angle or redshift yields better results, up to the point where the
covariance matrix becomes dominated by shot noise (which occurs for bins with area around
0.1-1 sq. deg.). For a very large number of pixels, the Gaussian approximation used to define
the covariance in our Fisher matrix would break down. In our fiducial case we have about
1.7 x 10^ clusters subdivided into 3,025 pixels, so that we are well within the Gaussian
regime. In addition, results for large angular pixels are less sensitive to systematics due to
non-linear physics or angular mask uncertainties. In Sec. IV we consider departures from the
fiducial assumptions, namely variations in the mass threshold, maximum redshift, and pixel
area.

We assume fiducial cosmological parameters based on the fifth-year data release of the
Wilkinson Microwave Anisotropy Probe [59]. Thus, we set the baryon density, Ω_b h² = 0.0227,
the dark matter density, Ω_m h² = 0.1326, the normalization of the power spectrum at
k = 0.05 Mpc^-1, δ_ζ = 4.625 × 10^-5, the tilt, n = 0.963, the optical depth to reionization,
τ = 0.087, the dark energy density, Ω_DE = 0.742, and the dark energy equation of state,
w = −1. In this cosmology, σ_8 = 0.796. We use CMBfast [60], version 4.5.1, to calculate the
transfer functions, and add Planck priors^3 when calculating the marginalized constraints on
parameters.

To study systematic errors in cluster cosmology, we add a generous set of nuisance parameters
described in Appendix A (see also Cunha [61] and Cunha et al. [51]), with 10 nuisance
parameters describing the bias and variance in the mass-observable relation and 3 parameters
describing uncertainty in the halo bias (a_c, p_c, and δ_c, cf. Eq. (B5)). The assumption of
3 nuisance parameters describing the Gaussian halo bias is somewhat ad hoc but conservative,
since for a given mass function the halo bias can be predicted to roughly 10% accuracy in the
range of scales we are interested in. We fix the photo-z scatter to 0.02 everywhere except in
Sec. IV D, where we consider the effects of including 10 additional nuisance parameters
describing photometric redshift errors. In this exploratory paper we do not consider models
for non-Gaussianity other than the one from Eq. (1), or observational systematic errors (e.g.
atmospheric blurring or completeness variations across the sky). The study of these effects is
left for future work.

^3 W. Hu, private communication

FIG. 1: Left panel: Sensitivity of the variance of cluster counts to non-Gaussianity. The black line shows the variance S_ii, the
short-dashed red line shows the (auto)correlation function ξ_ii, and the long-dashed blue line shows the squared mean counts m̄².
Note that S_ii = m̄² ξ_ii. We assumed a pixel with area 40 sq. deg. and radial redshift extent Δz = 0.2, centered at z = 0.5. Right
panel: Sensitivity of the covariance of cluster counts to non-Gaussianity. We show the off-diagonal elements of the clustering
matrix, normalized by the variance of the f_NL = 0 case (S_ii at f_NL = 0), as a function of angle between the ith and jth pixel.
We use the same pixelization as for the left panel. We show the Gaussian case (f_NL = 0), and four non-Gaussian models
(f_NL = ±20 and f_NL = ±100). Note that, because of the regularization, the results depend on the size of the survey. The larger
the survey, the larger the effect of non-Gaussianity.



Evaluating the expression for the Fisher matrix with a signal matrix of this size is clearly
challenging: the total size of the matrix S (see Eq. (9)) is N × N, where

N = N_pixels × N_mass × N_redshift = 121 × 5 × 5 = 3,025

in our fiducial case. The bottleneck is in calculating the ~10^7 elements of the matrix, each
of which involves the numerical computation of a rapidly oscillating triple integral; see
Eqs. (8) and (9). Unlike previous works which studied constraints on dark energy [53, 54, 55],
we cannot ignore the off-diagonal elements (i.e. the pixel covariance) of the matrix S since
those elements, while being very small for the Gaussian case, become significant for f_NL ≠ 0
(see the right panel of Fig. 1) due to the f_NL k^-2 scaling of the power spectrum as k → 0.
To reduce the size of the covariance matrix we assume that the information from the different
mass bins is independent, so that we can estimate the Fisher matrix for each mass bin
separately and then add them at the end. The scatter in the mass-observable relation can
generate correlations between mass bins at a given pixel. In addition, as Seljak [63] and
McDonald and Seljak [64] noticed, correlating the halos of different masses at large
separations would lead to improved constraints in our analysis, making our assumption
conservative.
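The bookkeeping behind these numbers is easy to verify. The short sketch below counts the matrix elements for the fiducial pixelization (121 angular pixels, 5 mass bins, 5 redshift bins) and shows how much treating the mass bins as independent reduces the problem; the variable names are illustrative.

```python
n_pix, n_mass, n_z = 11 * 11, 5, 5

# Full problem: one N x N matrix over all pixels, mass bins and redshift bins.
n_total = n_pix * n_mass * n_z
full_elements = n_total**2

# Independent mass bins: n_mass separate blocks over pixels and redshift bins.
n_block = n_pix * n_z
block_elements = n_mass * n_block**2

survey_area = n_pix * 41.32   # sq. deg., for 41.32 sq. deg. pixels

print("N =", n_total)
print("elements in full matrix:", full_elements)
print("elements with independent mass bins:", block_elements)
print("survey area [sq. deg.]:", survey_area)
```

The full matrix has about 9 × 10^6 elements, consistent with the order of magnitude quoted above, while the block treatment needs five times fewer.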



A. Regularization of the covariance 

As Wands and Slosar [65] pointed out, the two-point correlation function for biased tracers of
structure has an infrared divergence if f_NL is not zero. However, the measured correlation
function from any survey is of course finite because one cannot measure the variance of the
density field on scales larger than the survey. To that effect, Wands and Slosar [65] suggest
regularizing the correlation function ξ(r) by subtracting from it the variance of the density
field evaluated at the scale of the survey. However, Cunha and Slosar (private communication)
found that the regularization of Wands and Slosar [65] contains a typo; the correct
regularization term is given by

$$\int \frac{d^3k}{(2\pi)^3}\,|W_R(\mathbf{k})|\,b_{i\alpha}\,b_{j\alpha}\,P(k; z), \qquad (12)$$

where we use the mass bin α and redshift bin i corresponding to those of the correlation
function ξ_ij^α from which this is being subtracted. If i and j come from different redshift
bins, the geometric mean of the two redshifts is taken. The difference from what is presented
in Wands and Slosar [65] is that our expression has |W_R(k)| instead of |W_R(k)|² (cf. Eqs. 47,
49 and 50 in [65]). Using the above expression, the observed two-point correlation at a given
survey volume has the desirable property that it integrates to zero over the survey volume.
We approximate the window function |W_R(k)| as the Fourier transform of a spherical top-hat,
and adopt R = 2 h^-1 Gpc as the linear dimension of our survey. For the main analysis in this
paper, the effects of the divergence are not significant, since all of our results (except in
Sec. V) assume zero fiducial f_NL, and the analytic expression for the derivative ∂S_ij/∂f_NL
is weakly sensitive to the integration boundary. The divergence of the two-point correlation
does affect the covariance for non-zero f_NL and for pixel separations greater than a few
hundred Mpc. We use the lower boundary of integration k_min = 10"'', and check that results
are stable with respect to variations in this value and to whether or not the regularization
mentioned above has been applied. For Fig. 1 and the results in Sec. V we do apply the
corrected Wands-Slosar regularization prescription (cf. Eq. (12)).
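For reference, the Fourier transform of a spherical top-hat of radius R is W(x) = 3 (sin x − x cos x) / x³ with x = kR. The sketch below evaluates it for R = 2000 h^-1 Mpc (that is, 2 h^-1 Gpc), using a series expansion near x = 0 to avoid numerical cancellation; the function name and the sample wavenumbers are illustrative choices.

```python
import numpy as np

def tophat_window(k, radius):
    """Fourier transform of a spherical top-hat: 3 (sin x - x cos x) / x^3."""
    x = np.atleast_1d(np.asarray(k, dtype=float)) * radius
    w = np.empty_like(x)
    small = np.abs(x) < 1e-3
    # Series expansion 1 - x^2/10 avoids catastrophic cancellation near x = 0.
    w[small] = 1.0 - x[small] ** 2 / 10.0
    xs = x[~small]
    w[~small] = 3.0 * (np.sin(xs) - xs * np.cos(xs)) / xs**3
    return w

R_SURVEY = 2000.0                       # survey scale in Mpc/h
k = np.array([1e-5, 1e-4, 1e-3, 1e-2])  # wavenumbers in h/Mpc
for ki, wi in zip(k, tophat_window(k, R_SURVEY)):
    print(f"k = {ki:.0e} h/Mpc  ->  |W_R| = {abs(wi):.5f}")
```

The window is close to unity only for k well below 1/R, so the subtracted term is dominated by modes larger than the survey, which are exactly the modes responsible for the infrared divergence.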



. That is, we assume that 



Besides its impact on the regularization, the choice of survey geometry is important since the
distribution of pixel-pixel separations depends on the geometry. We assume that the survey
itself has a square shape (and implicitly work in a flat-sky approximation), and assume an
11 × 11 field of square-shaped pixels for each redshift bin. To populate the covariance
matrix, we precompute S_ij as a function of pixel separation for integer values of the
separation along a row of pixels in Eq. (9); that is, we set Δy_ij = 0 and vary Δx_ij at each
redshift. We use linear interpolation to estimate the covariance for pixels whose physical
separation, in units of Δx_{i(i+1)}, is non-integer. We find that the effects of disregarding
the pixel orientation are negligible (we verified this by changing the orientation of the bins
and finding little change in the results). Pre-computation of the covariance matrix elements
as a function of pixel separation greatly reduces the number of covariance terms we need to
calculate.
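The precompute-and-interpolate strategy can be sketched as follows for a single redshift bin. The covariance values along a row are placeholders (a simple decaying function, not numbers from this paper), and the variable names are hypothetical; the point is only to show how a handful of precomputed separations fills the full 121 × 121 matrix.

```python
import numpy as np

n_side = 11
# Integer separations along a row, out to the largest separation on the grid.
max_sep = int(np.ceil((n_side - 1) * np.sqrt(2.0)))
sep_grid = np.arange(max_sep + 1, dtype=float)
# Placeholder covariance at integer separations (NOT values from the paper).
cov_row = 1.0 / (1.0 + sep_grid) ** 2

# Pixel coordinates on the 11 x 11 grid and all pairwise separations.
ix, iy = np.divmod(np.arange(n_side**2), n_side)
sep = np.hypot(ix[:, None] - ix[None, :], iy[:, None] - iy[None, :])

# Linear interpolation in separation, ignoring the orientation of the pair.
S = np.interp(sep.ravel(), sep_grid, cov_row).reshape(sep.shape)

n_pairs = n_side**2 * (n_side**2 + 1) // 2
print("precomputed values:", sep_grid.size)
print("distinct pixel pairs (including diagonal):", n_pairs)
```

Only 16 expensive integrals per redshift pair are needed here instead of several thousand, at the cost of neglecting the dependence on pixel orientation.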

As the right panel of Fig. 1 shows, in the Gaussian case the off-diagonal terms of S_ij fall
off very fast. We find the covariance terms for pixels in different redshift bins to be
negligible, because we use broad redshift bins. Hence, we only calculate the covariance
between different redshift bins when estimating the derivative of the covariance with respect
to f_NL. To save time, for the results shown in Sec. IV we only calculate terms in adjacent
redshift bins. We checked that including larger redshift separations improves unmarginalized
constraints by about 30%, but including the regularization removes most of the improvement
(for fiducial f_NL = 0). To calculate the derivatives of the covariance with respect to f_NL,
we use the fact that the derivative of the bias with respect to f_NL is analytic so that

The terms we ignore correspond to the sensitivity of cluster counts to non-Gaussianity, and
they would only enhance the impact of f_NL, though slightly, as will be shown in the following
sections. In a real survey one actually has to calculate the covariance at non-zero values of
f_NL, for which our approach of evaluating the derivative analytically at f_NL = 0 would be
insufficiently general. For this sensitivity study, however, the analytic derivative is
perfectly acceptable. We examine the sensitivity of the constraints to different fiducial
values of f_NL in Sec. V.



IV. RESULTS 

Our results are presented as follows. First we discuss the sensitivity of cluster counts and
clustering of counts to f_NL, and examine unmarginalized constraints on f_NL. Second, we
examine the degeneracies with cosmological parameters and nuisance parameters due to modeling
uncertainties in the observable-mass relation and in the halo bias. Third and last, we look at
the impact of photometric redshift uncertainties.



A. Sensitivity of cluster covariance 

The effect of non-Gaussianity on clustering is a combination of several effects, which can be
identified from Eq. (8). The dominant effect is due to the explicit modification of the halo
bias (Eq. (2)), which affects ξ_ij (cf. Eq. (9)). In addition, non-Gaussianity affects the
mass function, which affects the mean cluster counts (cf.


In calculating ∂S_ij/∂f_NL, we only keep the dominant
term, which is the one with derivative with respect to


Eqs. (3) and (B1)), and the average cluster linear bias (cf. Eq. (B7)). The left panel of
Fig. 1 shows the dependence of the different terms that make up the clustering covariance
S_ij, as a function of f_NL. For this sensitivity plot, we assume a 40 sq. deg. pixel with
redshift thickness Δz = 0.2 centered around z = 0.5 and a mass threshold
M"^ = lO^^ ''/i~^M0, and show only the diagonal elements i = j for clarity. The relation
between the functions plotted in this figure is S_ii = m̄² ξ_ii. It is apparent from the figure
that ξ_ii encodes most of the dependence of the clustering signal on f_NL, and that the
clustering covariance (S_ij, or ξ_ij) is much more sensitive to f_NL than the mean counts m̄.
As mentioned previously, we neglect the implicit dependence of the mass function on f_NL when
calculating the covariance. Including it would only enhance the impact of f_NL, albeit
slightly.

FIG. 2: 1-σ uncertainties in the parameter f_NL as a function of the maximum angular separation between pixel centroids in
the covariance matrix. The left panel shows the unmarginalized constraints while the right panel shows marginalized constraints
assuming Planck priors and fixed halo-bias and observable-mass nuisance parameters. Zero separation indicates the case of
pure variances (as considered by Oguri [52]). The maximum angular separation between pixels for a 5,000 sq. deg. survey
divided into 41.3 sq. deg. pixels is about 90 degrees (or 10√2 pixel widths). This case would correspond to taking the full
covariance into account for the calculation of f_NL, but disregarding the covariance between different redshift bins. The blue
short-dashed line corresponds to constraints derived using only cluster counts. The red dashed line shows the constraints when
only the clustering of clusters is used, and the solid black line shows the combined constraints from counts and clustering.

In the right panel of Fig. 1 we plot the absolute value of the clustering covariance as a
function of angular separation between the centroids of two pixels. For reference, at z = 0.5,
a one-degree separation corresponds to about 23.4 h^-1 Mpc. For f_NL = 0, the clustering
covariance is large and positive at small separations, but becomes negative at intermediate
pixel separations (~6 deg or ~150 h^-1 Mpc at z = 0.5); this behavior corresponds to a similar
behavior of the two-point correlation function ξ(r) (see e.g. Ref. ^S]). The effect of nonzero
f_NL depends on its sign as well as on the scale. For positive f_NL, the covariance increases
monotonically with f_NL roughly up to the scale of the survey. Beyond that scale (~60° in our
example), the covariance reverses its trend with f_NL and becomes negative due to the integral
constraint imposed by the regularization. For negative f_NL, the dependence of the covariance
S_ij on f_NL is more complicated because the total bias becomes negative at large enough
scales; thus, for f_NL < 0 the covariance depends monotonically on |f_NL| only on scales
(< 7° in the right panel of Fig. 1) for which the bias correction (the second term in Eq. (2))
is subdominant. Note that Fig. 1 hides the fact that the number of pixels at a given
separation increases with separation: the number of off-diagonal elements in the covariance is
much bigger than the number of diagonal elements, and this gives a "geometric boost" to the
covariance.
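The angular scales quoted here can be checked with a short calculation. The sketch assumes a flat universe with Ω_DE = 0.742 and Ω_M = 0.258 (the flatness and the neglect of radiation are assumptions of this sketch), integrates the comoving distance to z = 0.5 with the trapezoid rule, and converts the pixel geometry into degrees; function and variable names are illustrative.

```python
import numpy as np

C_OVER_H0 = 2997.92458   # c/H_0 in Mpc/h

def comoving_distance(z, omega_m=0.258, omega_de=0.742, n=2001):
    """Comoving distance in Mpc/h for a flat universe (assumed here)."""
    zz = np.linspace(0.0, z, n)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + omega_de)
    dz = zz[1] - zz[0]
    # Trapezoid rule written out explicitly.
    integral = dz * (inv_e.sum() - 0.5 * (inv_e[0] + inv_e[-1]))
    return C_OVER_H0 * integral

r_half = comoving_distance(0.5)
mpc_per_deg = r_half * np.deg2rad(1.0)

pix_side_deg = np.sqrt(41.32)                    # side of a 41.32 sq. deg. pixel
max_sep_deg = 10.0 * np.sqrt(2.0) * pix_side_deg # diagonal of the 11 x 11 field

n_pix = 11 * 11
n_offdiag_pairs = n_pix * (n_pix - 1) // 2

print("comoving distance to z = 0.5 [Mpc/h]:", r_half)
print("Mpc/h per degree at z = 0.5:", mpc_per_deg)
print("maximum centroid separation [deg]:", max_sep_deg)
print("off-diagonal pairs vs diagonal elements:", n_offdiag_pairs, "vs", n_pix)
```

Under these assumptions one degree at z = 0.5 subtends about 23.4 h^-1 Mpc and the largest centroid separation is about 91 degrees; the 7,260 off-diagonal pairs per redshift bin, against 121 diagonal elements, are the origin of the "geometric boost".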



B. Unmarginalized constraints from clustering and 
counts 

Both panels of Fig. 2 show f_NL constraints as a function of the maximum pixel separation
allowed in the covariance (cf. Eq. (8)) used to generate the Fisher matrix constraints (cf.
Eq. (11)).

In the left panel of Fig. 2 we see that the cluster counts yield better unmarginalized
constraints than the variance of cluster counts alone; however, once the covariances (i.e.
off-diagonal terms of the signal matrix S_ij) are included, the clustering information rapidly
beats that from the counts. In Table I we show the unmarginalized f_NL constraints for a
variety of survey expectations. The constraints change in the direction expected: the lower
the mass threshold and the higher the maximum redshift, the better. This Table also shows that
decreasing the angular area of the pixels to 12.5 deg² results in substantial (O(50%))
improvements. The improvement with decreasing pixel size, for f_NL constraints, does not
happen if we consider only the variance in counts. For other parameters that are sensitive to
small-scale information, such as Ω_DE and w, the smaller pixels do translate into better
constraints even if only the sample variance is used. Further refinement of the pixelization
leads to improvement up to the regime of shot-noise domination (which occurs for pixels of
~0.1-1 deg²). Unmarginalized constraints are of order 10"^ in this regime, though
observational systematics are likely to dominate statistical errors of this size.

                                 Unmarginalized error σ(f_NL)
Assumption         Number        Counts    Covariance    Both
Fiducial           1.7 X 10^     9.1       1.8           1.7
12.5 deg² pix      1.7 X 10^     9.2       1.1           1.1
z_max = 0.8        1.3 X 10^     13        2.3           2.2
z_max = 1.4        2.4 X 10^     6.0       1.4           1.4
M"' = lO^-^'^      3.6 X 10^     8.3       1.4           1.4
                   7.7 X 10*     10        2.3           2.3

TABLE I: Unmarginalized constraints on f_NL. The fiducial case assumes no nuisance parameters,
5 bins in mass and redshift each, and other assumptions as in the text. Variations in the
assumptions are shown in the first column, followed by the total number of clusters in the
5,000 deg² survey we assumed, while cluster counts, covariance, and combined projected 1-σ
constraints on f_NL are given in the following three columns.



C. Degeneracies with cosmological and nuisance 
parameters. 

In the right panel of Fig. 2 we show the marginalized constraints on f_NL assuming Planck
priors and fixed nuisance parameters (both halo bias and mass-observable). We see that the
change in the constraints from combined counts^4 and clustering is even more remarkable than
for the unmarginalized constraints shown in the left panel. The full clustering covariance
yields about one order of magnitude better constraints than if only the variance is used. As
we shall see, this fractional improvement remains even when we include nuisance parameters.

Tables [n] and III show /nl constraints using the vari- 
ance of cluster counts, and the full covariance, respec- 
tively. The results assumed Planck priors on the cos- 
mological parameters, 10 nuisance parameters describing 
the mass-observable relation and 3 nuisance parameters 
describing uncertainties in the Gaussian halo bias. 

Comparing the last columns of Tables II and III we
see that the counts+covariance combination yields about
an order of magnitude improvement over simply using
counts+variance. For the counts+variance case, the
uncertainties in the halo bias parameters are the main source
of degradation to f_NL constraints. Without the information
from large separations provided by the full covariance,
the Fisher matrix cannot disentangle the effects
due to the Gaussian bias from the f_NL contribution.
When the full covariance is used (cf. Table III), the
errors in the mass-observable relation are the dominant
source of degradation. Marginalizing over all nuisance
parameters, assuming flat priors, yields a degradation by a
factor of ~3 in σ(f_NL) (from 1.8 to 6.0 in Table III). This
is not large, considering we added
13 nuisance parameters, but not negligible either. Even
modest prior information can improve the marginalized
constraints significantly.
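
The distinction between unmarginalized and marginalized errors, and the way a prior restores accuracy, can be illustrated with a toy Fisher matrix. The sketch below is only an illustration: the 2x2 matrix, the prior width, and the function names are hypothetical and are not taken from the forecasts in this paper.

```python
import numpy as np

def unmarginalized_error(F, i):
    """Error on parameter i when all other parameters are held fixed."""
    return 1.0 / np.sqrt(F[i, i])

def marginalized_error(F, i):
    """Error on parameter i after marginalizing over all other parameters."""
    return np.sqrt(np.linalg.inv(F)[i, i])

def add_gaussian_prior(F, i, sigma_prior):
    """Return a copy of F with a Gaussian prior of width sigma_prior on parameter i."""
    F_new = np.array(F, dtype=float)
    F_new[i, i] += 1.0 / sigma_prior**2
    return F_new

# Hypothetical Fisher matrix: index 0 plays the role of f_NL,
# index 1 the role of a single degenerate nuisance parameter.
F_toy = np.array([[4.0, 3.0],
                  [3.0, 4.0]])

sigma_fixed = unmarginalized_error(F_toy, 0)
sigma_marg = marginalized_error(F_toy, 0)
sigma_with_prior = marginalized_error(add_gaussian_prior(F_toy, 1, 0.5), 0)
```

In this toy case the marginalized error exceeds the unmarginalized one, and the prior on the nuisance parameter recovers part of the difference, which is the qualitative behavior described above.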

There are two principal reasons for the strong improve- 
ment of errors when the covariance is added: 



• The strong scale dependence of the bias as a function
of wavenumber k implies that most signal comes from the
covariances, since the covariances have longer lever
arms in k than the variance alone (and are much
more sensitive than counts, which only depend on
non-Gaussianity via the mass function);

• The signature of f_NL in the covariance is unique, as
no other cosmological parameter leads to a similar
effect; therefore, the degeneracy with other
cosmological parameters is very small, as first noted
by [35].



Comparing the f_NL constraints for the full covariance for
fixed nuisance parameters (Table III) to the unmarginalized
constraints (Table I), we see that degeneracies with
cosmological parameters only result in a small degradation
of f_NL constraints (from 1.7 to 1.8).

^4 The slight degradation in f_NL constraints from counts seen in the
right panel is real, and is due to adding the (positive) covariance
matrix elements to the counts noise; see the first term on the
RHS of Eq. (11). Using the full covariance therefore yields very
slightly worse constraints.

Tables II and III also show the constraints obtained
using counts alone, or (co)variance by itself. The
information about f_NL from the counts is very degenerate
with the cosmological and nuisance parameters. The
"∞" symbols indicate that the Fisher matrix could not
be inverted, i.e., that particular technique was unable
to simultaneously constrain all of the parameters. From
the last row of both tables, we see that cluster counts
are effective at constraining the cosmological parameters
and mass-observable relation (from the mass binning)
whereas the (co)variance constrains mainly the nuisance
parameters and f_NL.

Marginalization degrades the counts+covariance f_NL
constraints roughly independently of the different survey
assumptions, so one can use Table I to infer marginalized
constraints. For example, from Table I we see that using
12.5 deg^2 pixels yields about 60% better constraints.
The full marginalized constraints are also improved by
a similar factor so that, for example, σ(f_NL) ≈ 3.9 for
12.5 deg^2 pixels marginalized over the 13 nuisance parameters
(compared to σ(f_NL) = 6.0 for 40 deg^2 pixels).

Marginalized errors - Variance only

Nuisance parameters          Counts                          Variance                        Counts+Variance
Halo bias     M_obs          σ(Ω_DE)   σ(w)    σ(f_NL)       σ(Ω_DE)   σ(w)    σ(f_NL)       σ(Ω_DE)   σ(w)    σ(f_NL)
Marginalized  Marginalized   ∞         ∞       ∞             ∞         ∞       ∞             0.075     0.25    55
Known         Marginalized   0.095     0.32    3.4 x 10^     ∞         ∞       ∞             0.061     0.21    27
Marginalized  Known          ∞         ∞       ∞             0.077     0.26    98            0.0037    0.016   44
Known         Known          0.0046    0.021   91            0.053     0.18    67            0.0035    0.014   19

TABLE II: Marginalized constraints on f_NL and dark energy with cluster counts, variance of the counts, and the two combined.
The fiducial case assumes 5 bins in mass and redshift each with a mass threshold Af"" — 10^^'^, maximum redshift Zmax ~ 1.0, 
and other assumptions as in the text. Assumptions about the nuisance parameters are varied, and are shown in the first two
columns. Entries with ∞ indicate that the method was unable to constrain the parameters.



Marginalized errors - Full Covariance

Nuisance parameters          Counts                          Covariance                      Counts+Covariance
Halo bias     M_obs          σ(Ω_DE)   σ(w)    σ(f_NL)       σ(Ω_DE)   σ(w)    σ(f_NL)       σ(Ω_DE)   σ(w)    σ(f_NL)
Marginalized  Marginalized   ∞         ∞       ∞             ∞         ∞       ∞             0.069     0.23    6.0
Known         Marginalized   0.097     0.33    2.1 x 10^     0.13      0.43    12            0.065     0.22    5.4
Marginalized  Known          ∞         ∞       ∞             0.099     0.34    7.0           0.0036    0.014   3.8
Known         Known          0.0051    0.023   94            0.042     0.13    5.1           0.0036    0.014   1.8



TABLE III: Marginalized constraints on f_NL and dark energy with cluster counts, covariance of the counts, and the two
combined. The fiducial case assumes 5 bins in mass and redshift each with a mass threshold M*'' = lO" ^ maximum redshift 
z_max = 1.0, and other assumptions as in the text. Assumptions about the nuisance parameters are varied, and are shown in
the first two columns. Entries with ∞ indicate that the method was unable to constrain the parameters.



D. Photometric redshift errors 

To study the effects of photometric redshift errors, we 
add 10 nuisance parameters to the analysis, namely two 
parameters — one each describing the photo-z scatter 
and bias — in each of the five redshift bins. The results 
are summarized in Table IV.

If either the halo bias or the mass-observable nui- 
sance parameters are fixed, then the degradation from 
the inclusion of photo-z's is not very damaging. In other 
words, the additional correlations between either photo-z 
and halo bias parameters, or between photo-z and mass- 
observable parameters, do not cause substantial additional
degradation to f_NL constraints (relative to the case
where only the photo-z parameters are unknown). 

However, when all 23 nuisance parameters (10 for the
photo-z's, 10 for the mass-observable relation, and 3 for
halo bias) are left free, one cannot simultaneously
constrain dark energy and f_NL, and the constraints on both
drastically degrade. We traced the biggest source of
degradation to the redshift evolution parameters in the
mass-observable relation and to the photo-z bias nuisance
parameters. Simply adding a 33% prior to the
one parameter describing the evolution of the bias in
P(M_obs|M) (parameter a_1 in Eq. (A3)) was enough to
reclaim respectable accuracy, with σ(f_NL) = 18.8 (see
the bottom row of Table IV). Alternatively, if the bias
in each photo-z bin is known to the absolute accuracy of
0.01 with all other parameters free, then σ(f_NL) = 7.0,
which is just ~15% worse than when photo-z parameters
are fixed^5. For a survey such as the DES, these
requirements should be relatively easy to satisfy, given that
spectroscopic samples of order 10^4 or more galaxies will be
available to calibrate the photometric redshift errors (see e.g.
Eqs. (19) and (20) in Hearin et al. [ST]).

^5 Unlike f_NL, the dark energy constraints are sensitive to both bias
and scatter of the photo-z's. For a prior uncertainty in the
photo-z bias of 0.01 per bin, the photo-z scatter needs to be known to
0.025 per bin to achieve small (< 15%) degradation in σ(Ω_DE)
and σ(w) relative to the case of perfectly known photo-z errors.

The effects of photo-z uncertainties

Nuisance parameters
Halo bias       M_obs           σ(Ω_DE)   σ(w)     σ(f_NL)
Known           Known           0.016     0.041    6.5
Marginalized    Known           0.021     0.053    6.7
Known           Marginalized    0.11      0.36     9.4
Marginalized"   Marginalized"   0.23"     0.77"    19"


TABLE IV: Effect of photometric redshift uncertainties on
the marginalized constraints on f_NL. The fiducial case
assumes 5 bins in mass and redshift each with a mass threshold
M"' = lO^'^'^ and maximum redshift Zmax ~ 1.0, and other 
assumptions as in the text. Variations are in the first two
columns, while the projected 1-σ constraints on Ω_DE, w, and
f_NL are given in the following three columns.
In the bottom row, the superscript " signals that a Fisher matrix
prior of F_{a_1 a_1} = 10 is added to the nuisance parameter a_1
defined in Eq. (A3), which describes the redshift evolution of
the bias in the mass-observable relation.



V. DISCUSSION



A. Choice of the fiducial model



In our fiducial approach we estimated errors in f_NL
around f_NL = 0. However, it is a slightly different matter
to estimate the detectability of non-Gaussianity, which
requires estimating the signal-to-noise at which a non-zero
fiducial value of f_NL can be differentiated from zero^6.
The detectability is independent of the fiducial value if
the observable quantity is linear in the parameter(s); this
is clearly not the case here since the clustering signal is
a quadratic function of the bias, which itself depends
linearly on f_NL.

Fig. 3 shows the fiducial unmarginalized constraints on
f_NL as a function of its fiducial value. Unlike in the
results shown previously, here we calculate all elements of
the covariance matrix and its derivative with respect to
f_NL (which is why the constraints for f_NL = 0 shown in
the plot are slightly better than what is shown in Table
I). The figure shows the tightest constraints for |f_NL| ~ 10,
more than 4 times stronger than those for our fiducial
assumption of f_NL = 0. The "witch's hat" shape shown
in Fig. 3 can be understood by examining the second
term on the RHS of Eq. (11), which contains the Fisher
information from the covariance of cluster counts. The
f_NL constraints are set by the competition between the
signal, represented by the derivative of the covariance
with respect to f_NL, ∂S/∂f_NL, and the noise, given by the
total covariance, C. These two quantities vary with f_NL
at different rates; the total covariance depends (roughly)
quadratically on f_NL whereas ∂S/∂f_NL has only a linear
dependence. In addition, the matrix elements of ∂S/∂f_NL and C
have different sensitivity to f_NL at each angular separation,
and it is the relative importance of the off-diagonal
matrix elements relative to the diagonal elements that
sets the shape of the curve in Fig. 3.

^6 Arguably the best approach might be to use Bayesian model
selection techniques and, for a range of f_NL values, test if the
hypothesis f_NL = 0 can be rejected. We do not pursue such an
approach in this paper.
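
The competition between signal (the derivative of the covariance) and noise (the total covariance) can be made concrete with a short numerical sketch. It assumes the standard Fisher matrix for Gaussian-distributed data whose mean and covariance both depend on the parameters, i.e. a counts term plus a trace term; the function name and the toy 2x2 matrices are hypothetical and serve only as an illustration.

```python
import numpy as np

def fisher_element(dm_a, dm_b, C, dS_a, dS_b):
    """Fisher element for Gaussian data with parameter-dependent mean and covariance.

    dm_a, dm_b : derivatives of the mean counts with respect to two parameters
    C          : total covariance (shot noise plus sample covariance)
    dS_a, dS_b : derivatives of the sample covariance with respect to the parameters
    """
    C_inv = np.linalg.inv(C)
    counts_term = dm_a @ C_inv @ dm_b                       # first term: counts
    cov_term = 0.5 * np.trace(C_inv @ dS_a @ C_inv @ dS_b)  # second term: covariance
    return counts_term + cov_term

# Toy two-pixel example with made-up numbers.
C_toy = np.diag([2.0, 4.0])
dS_toy = np.eye(2)
no_counts = np.zeros(2)
dm_toy = np.ones(2)

F_cov_only = fisher_element(no_counts, no_counts, C_toy, dS_toy, dS_toy)
F_total = fisher_element(dm_toy, dm_toy, C_toy, dS_toy, dS_toy)
```

Because C enters the trace term twice through its inverse, a covariance that grows faster with f_NL than its derivative eventually suppresses the information, which is the mechanism invoked above for large |f_NL|.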

For very small values of |f_NL| (≪ 10), the off-diagonal
elements of the covariance are very small, and hence do
not contribute much to the signal, or to C. This can be
seen in the f_NL = 0 curves in the right panel of Fig. 1
and in the right panel of Fig. 3. Note that the plots hide
the fact that the number of pixels at a given separation
increases with separation: the number of off-diagonal
elements in the covariance is much bigger than the number
of diagonal elements, and this gives a "geometric boost"
to the covariance.
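
The size of this geometric boost can be estimated by counting pixel pairs. The sketch below is a rough illustration that considers only angular pixels within a single redshift and mass bin, using the 5,000 deg^2 survey area and the 40 deg^2 and 12.5 deg^2 pixel sizes quoted in the text; the function name is hypothetical.

```python
def pixel_pair_counts(survey_area_deg2, pixel_area_deg2):
    """Return (number of pixels, number of distinct off-diagonal pixel pairs)."""
    n_pix = int(round(survey_area_deg2 / pixel_area_deg2))
    n_pairs = n_pix * (n_pix - 1) // 2
    return n_pix, n_pairs

fiducial_pixels = pixel_pair_counts(5000.0, 40.0)   # 40 deg^2 pixels
small_pixels = pixel_pair_counts(5000.0, 12.5)      # 12.5 deg^2 pixels
```

With 125 pixels there are 7,750 distinct pairs, so off-diagonal elements outnumber diagonal ones by a factor of about 60 even before redshift and mass bins are included.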

For large values of |f_NL| (≫ 10), the off-diagonal
elements of the covariance matrix can be significant relative
to the diagonal elements (see the f_NL = ±100 curves in
the right panel of Fig. 1). Therefore, the constraints on
f_NL now worsen with the increasing value of |f_NL|, albeit
slowly.

Finally, in the intermediate range of |f_NL| ~ 10, the
off-diagonal elements of C are small relative to the diagonal
and near-diagonal elements. For example, the right panel
of Fig. 1 shows that, for f_NL = 20, the far-separation
covariances are much smaller than the variances. However,
the derivatives of the sample covariance, ∂S/∂f_NL, are
only moderately smaller for the off-diagonal pixels than
for the diagonal ones (e.g. a factor of ~4 for f_NL = 20;
see the right panel of Fig. 3). Therefore, it is at these
intermediate values of |f_NL| ~ 10 that we find the best
signal-to-noise, and best constraints on f_NL.

In summary, the dependence on the fiducial value of
f_NL can be understood rather simply. For small f_NL, the
large-scale covariances do not add much signal. For large
f_NL the covariances add too much noise. At intermediate
f_NL, the signal-to-noise relation is "just right". We
caution that the shape of the curve in Fig. 3 depends on
the volume (and geometry) of the survey as well as on
the number density of sources. The width of the pixels
affects the width of the central part of the "hat" slightly.
Smaller bins tend to shift the minima to smaller values
of |f_NL|. We conclude that the power of a DES-like cluster
survey to rule out the Gaussian hypothesis may be
even greater than indicated in the Tables in this paper, since
the error at f_NL ≠ 0 is nearly always smaller than that
for f_NL = 0. This is another exciting development, but
warrants further investigation, and in particular a more
detailed study of the dependencies on the overall survey
volume and selection. In this initial study we simply
adopt the conservative errors, and show the f_NL = 0
results everywhere except in Fig. 3.

[Figure 3. Horizontal axes: fiducial f_NL (left panel); angular separation in degrees (right panel).]

FIG. 3: Left panel: Unmarginalized 1-σ constraints on f_NL as a function of the fiducial value of this parameter, assuming
five redshift and five mass bins. The "witch's hat" shape can be explained from the competition between the derivative of the
covariance with respect to f_NL, and the total covariance at the fiducial f_NL; see text. Right panel: Derivative of the signal
matrix elements S_ij with respect to f_NL as a function of angular separation between pixels i and j, for f_NL = -40, -20, 0, 20,
and 40. Recall that, at z = 0.5, a separation of 1 degree corresponds to about 23 h^-1 Mpc.
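
The conversion between angular separation and comoving distance quoted in the caption can be checked with a few lines of code. The sketch below assumes a flat ΛCDM cosmology with Ω_m = 0.27, which is an illustrative choice rather than the fiducial model of this paper, and uses a simple trapezoidal integration; the function names are hypothetical.

```python
import numpy as np

C_OVER_H0 = 2997.92458  # c / (100 km/s/Mpc), in h^-1 Mpc

def comoving_distance(z, omega_m=0.27, n_steps=2000):
    """Line-of-sight comoving distance in h^-1 Mpc for flat LCDM (assumed cosmology)."""
    zs = np.linspace(0.0, z, n_steps + 1)
    inv_E = 1.0 / np.sqrt(omega_m * (1.0 + zs) ** 3 + (1.0 - omega_m))
    integral = np.sum(0.5 * (inv_E[1:] + inv_E[:-1]) * np.diff(zs))  # trapezoid rule
    return C_OVER_H0 * integral

def transverse_separation(theta_deg, z, omega_m=0.27):
    """Comoving transverse separation in h^-1 Mpc subtended by theta_deg at redshift z."""
    return np.radians(theta_deg) * comoving_distance(z, omega_m)

sep_mpc = transverse_separation(1.0, 0.5)
```

For this assumed cosmology the result is close to the 23 h^-1 Mpc per degree quoted above, so separations of ten degrees or more correspond to hundreds of megaparsecs.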



B. Clusters vs. galaxies 

It is useful to compare the cluster constraints obtained
here with the expected constraints from a similar,
DES-type, galaxy survey. Forecasts of constraints on
primordial non-Gaussianity from galaxy clustering were studied
recently [35, 73] using the Fisher matrix and a simple
Feldman-Kaiser-Peacock (FKP [SHj]) estimator that
counts modes of P(k) and combines them with the survey
volume and its galaxy density. Perhaps counterintuitively,
our constraints are a factor of a few better than
those from galaxies estimated previously. We now
explain the origin of this apparent discrepancy.
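
For orientation, the mode counting behind FKP-type forecasts can be sketched as follows. This assumes the commonly used effective-volume expression, V_eff = V [nP/(1 + nP)]^2, and the corresponding fractional error on a band power; it is not necessarily the exact implementation used in the references above, and the function names and example numbers are hypothetical.

```python
import math

def effective_volume(volume, n_bar, power):
    """FKP effective volume for survey volume V, number density n and power P."""
    weight = n_bar * power / (1.0 + n_bar * power)
    return volume * weight ** 2

def fractional_power_error(volume, n_bar, power, k, dk):
    """Fractional error on P(k) in a shell of width dk from counting independent modes."""
    n_modes = effective_volume(volume, n_bar, power) * k ** 2 * dk / (4.0 * math.pi ** 2)
    return 1.0 / math.sqrt(n_modes)

# Illustrative numbers only: V in (h^-1 Mpc)^3, n in (h^-1 Mpc)^-3, P in (h^-1 Mpc)^3.
example_error = fractional_power_error(1.0e9, 1.0e-4, 2.0e4, 0.01, 0.005)
```

The k^2 dk factor makes explicit that few modes are available at the smallest wavenumbers, which is where the f_NL signal resides.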

Both clusters and galaxies probe the power spectrum
of dark matter halos (and thus the halo bias). However,
there are some important differences:

• Clusters additionally probe the mass function,
which determines the counts, and also weakly
affects the bias b_0(M, z); see Eqs. (B6) and (B7);

• The number density of galaxies may be significantly
higher, depending on how they and the clusters are
selected. However, as mentioned in Sec. III, the
larger size of galaxy samples may not bring much
additional information, since the constraints on f_NL
benefit from very large-scale halo separations, and
not from intra-halo correlations;

• Clusters reside in more massive halos than galaxies,
and thus have a higher bias. The higher the bias,
the stronger the correlation (cf. Eq. (9));

• With regard to systematics, clusters can naturally
be binned by the mass-observable, which helps
break degeneracies with nuisance parameters. This
also allows utilization of the cross-correlation between
different mass bins to reduce the impact of sample
variance (e.g. [63, 64]), which we do not exploit in
this paper;

• Large spectroscopic samples of galaxies are
expected in the near future, whereas clusters will
rely on photometric redshifts; therefore, galaxy
redshifts are likely to be more accurate than cluster
redshifts.

Given all these differences, it is difficult to predict
whether clusters or galaxies will give stronger
constraints on primordial non-Gaussianity without a direct
calculation. We have verified that the FKP estimator
of galaxy constraints on f_NL indeed gives a weaker
result, and is in rough agreement with previous estimates
in [35, 45, 46].

However, as discussed in Tegmark et al. [69], the
FKP estimator is only optimal and lossless on scales
much smaller than the linear size of the survey. Since
good constraints on f_NL benefit from precisely the
large-wavelength modes, it is not surprising that the FKP
estimator for galaxies indicates worse constraints than our
pixel-based estimator for clusters. We have additionally
verified that constraints on the constant part of the bias,
b_0 (see Eq. (2)), or the dark energy equation of state
w, which do not benefit as much from large-wavelength
modes, are comparable when estimated from the
pixel-based formalism (from this paper) and the FKP approach
assuming the same survey volume and number density of
objects.



C. Comparison to previous work 

Numerous papers have studied the power of cluster
counts alone to probe primordial non-Gaussianity.
To the extent that such constraints are generally
weak due to degeneracies, and strongly depend on
the priors and nuisance parameters varied, our results
(see the "counts" columns in Table III) are in broad
agreement with these studies.



A more interesting comparison can be made with
the recent work of Oguri [52], who studied the
counts+variance case of clusters, corresponding to the
results in our Table II. The main difference between the
two studies is that we additionally considered the
covariance of cluster counts, and found that it leads to a
huge further improvement in the constraints. However,
even for the counts+variance case only, our results differ
substantially, and we forecast a much weaker constraint
on non-Gaussianity than Ref. [52]. For example, we get
σ(f_NL) ≈ 20-30 compared to σ(f_NL) ≈ 8 in Ref. [52]
in the fairest comparison with their DES survey case
and our assumptions with either no nuisance parameters
or full mass-observable nuisance parameters^7. These
discrepancies could probably be explained by a number
of other differences in the analyses: mass functions
(Ref. [52] uses the Lo Verde et al. mass function with an
analytic fit for skewness, while we use the Dalal et al. mass
function); cosmological parameter priors
(Ref. [52] uses diagonal priors on some parameters
while we use the full, off-diagonal Planck prior Fisher
matrix), etc. We have not attempted to reproduce results
from Ref. [52] using the assumptions made in that paper.

^7 Ref. [52] assumes only two mass-observable nuisance parameters.



D. Issues for future study

There are a number of effects that remain to be studied
in detail, but are beyond the scope of this preliminary
analysis. We list them here:

• Fisher matrix approximation: in this paper we have
assumed the fiducial value of f_NL = 0 and calculated
the errors on f_NL by taking the derivatives of
observables with respect to this parameter. This
"Fisher error" will be a good approximation to
the true error if the error itself is small. Therefore,
at least in the cases where the f_NL error is
tight, we expect the Fisher approximation to be a good
one, though this should eventually be checked with
Markov chain Monte Carlo methods.

• Calculational issues: The computation of the cluster
covariance is time consuming, particularly for
small but non-zero values of f_NL. In this work we
have largely avoided this issue by using the Fisher
matrix approximation and taking analytic derivatives
around f_NL = 0 (and a few other values),
which enabled us to only evaluate the covariance
at the fiducial Gaussian model. With real data,
however, a full exploration of parameter space will 
be necessary, which might be sufficiently time con- 
suming to warrant analysis using a smaller set of 
observable parameters. For example, one could re- 
sort to using larger pixels and a coarser binning 
in redshift, or perhaps using no pixels at all. One 
could also explore speeding up the covariance cal- 
culations with various mathematical tricks. 

• Mass function: we have assumed the Dalal et al.
[35] mass function, which has been calibrated from
numerical simulations and simply shifts the mass
of halos with non-Gaussianity. A number of
alternative mass functions have recently been proposed
in the literature and studied numerically
[42, 70]. While the agreement in the relevant quantity
n_NG(M, z)/n_G(M, z) is becoming good, there
is still no uniform agreement in the community
about the convergence. The overall constraints are
expected to be robust given that most of the effect
of non-Gaussianity comes from the bias scaling
as f_NL k^-2 and not the mass function. Nevertheless,
we expect constraints in this paper to be on
the conservative side: given that the Dalal et al.
mass function predicts a smaller effect due to
non-Gaussianity than some of the other popular functions,
use of these other mass functions would only
increase the effects due to non-Gaussianity and thus
improve the error bars on f_NL.

• Corrections to the bias formula: While the
dependence of bias on f_NL is established to follow
Eq. (2) both analytically and numerically, it could
be that there are second-order corrections to the 
bias formula. These have been discussed in the lit- 
erature; for example, it appears that a small con- 
stant offset in bias is warranted by the simulations 
and some analytical results [H] [331 |33|. Study 
of these higher-order corrections is very important 
but, given that there is no convergence in the com- 
munity on this issue as of yet, we leave their inclu- 
sion for future work. 

• Relativistic corrections and gauge dependence:
Wands and Slosar [65] have shown that, to first
order, the scale-dependent bias does not receive
relativistic corrections at large scales, using a
spherical collapse model. However, other authors have
shown that higher-order corrections in the matter
perturbations can produce non-Gaussianity (see
e.g. [HllZl]). How the higher-order corrections
propagate to the halo bias is yet to be understood in
detail.

• Observational systematics: In this paper we have
modeled the systematic uncertainties in
understanding of the Gaussian bias b_0(M, z) and the
relation between cluster mass and its observational
proxy by introducing nuisance parameters that
describe uncertainty in these relations. However, we
have not attempted to model observational uncer- 
tainties, such as variations in atmospheric seeing 
or photometric calibration. Clearly, knowledge of 
such uncertainties over large angular scales will be 
important if measurements of non-Gaussianity are 
not to be substantially degraded. We leave the 
study of observational systematics for future work. 



VI. CONCLUSIONS 

In this paper we studied how well primordial
non-Gaussianity of the local type can be probed with galaxy
clusters. We took into account cluster number counts, as
well as the full covariance of cluster counts-in-cells. We
allowed generous uncertainties in the knowledge of the
cluster mass-observable relation, the photometric
redshifts, and the Gaussian halo bias (we did not consider
systematics due to uncertainties in angular selection,
which may be important). As we discuss at length in
Sec. III, the Fisher matrix calculation is computationally
challenging, and we resorted to a number of conservative
approximations, the most important of which is using
very large pixels. Since angular selection issues are
expected to be most significant at small angular scales, our
pixel choices partly justify neglecting angular
uncertainties.

We found that most information on primordial
non-Gaussianity comes from the previously neglected
covariance of counts. The covariance links cluster overdensities
across large distances, and thus benefits the constraints
on primordial non-Gaussianity of the local type. The
reason is easy to understand: the non-Gaussian parameter
f_NL enters through the term proportional to k^-2 in
the bias, and correlates cluster counts in bins separated
by hundreds of megaparsecs. Other cosmological
parameters do not lead to these far-separation correlations in
cluster counts (see the right panel of Fig. 1). Correlations
of cluster counts across vast spatial distances of hundreds
of megaparsecs therefore represent a smoking-gun
signature of primordial non-Gaussianity of the local type.

The combination of counts and clustering is particularly
effective at breaking degeneracies of f_NL with
cosmological and nuisance parameters, since the two
statistical probes complement each other very well. While our
full set of 23 freely varying nuisance parameters can
degrade f_NL constraints by factors of a few, even modest
prior uncertainties on some of them break degeneracies
and restore the accuracy in f_NL. For example, the bias
in each photo-z bin needs to be known to 0.01 to keep
f_NL constraints within 15% of their values for the case of
perfectly known photo-z's.

We investigated the sensitivity of our results to the
choice of fiducial value of f_NL and found that the
uncertainty in f_NL at f_NL ≠ 0 is smaller than that for f_NL = 0.
In other words, a small non-zero value of f_NL may even
be more sensitively differentiated from the f_NL = 0 case
than indicated in our Tables; the reason for this is
explained in Sec. V A.



Our forecasts indicate very strong constraints on
primordial non-Gaussianity, which is perhaps surprising.
However, closer inspection reveals a number of effects
that help clusters achieve these numbers; we discuss these
in Sec. V B. In particular, we use the pixel-based
estimator, which is well suited for extracting signal from very
large scales. Previous error forecasts of non-Gaussianity
from galaxy clustering used the suboptimal FKP
estimator; dark-energy studies that did use the pixel-based
estimator only considered variance of cluster counts.

To achieve the full potential of forecasted constraints 
discussed here, a few more issues need to be carefully 
studied. Particularly important are theoretical uncer- 
tainties in linking dark matter halos to observed clusters 
of galaxies, and observational systematics across large 
angular scales. While constraints on primordial
non-Gaussianity have improved two orders of magnitude
between COBE [73] and WMAP [22], another one or even
with upcoming surveys of large-scale structure, especially 
if they include both dark matter halo counts and their 
clustering covariance. 
Appendix A: Parametrization of mass-observable
relation 



Appendix B: Photometric redshift errors and 
Gaussian halo bias 



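We assume a log-normal form for the probability of
measuring an observable signal, denoted M_obs, given true
mass M,

    p(M_{obs}|M) = \frac{1}{\sqrt{2\pi}\,\sigma_{\ln M}} \exp[-x^2(M_{obs})],    (A1)

where

    x(M_{obs}) \equiv \frac{\ln M_{obs} - \ln M - \ln M^{bias}(M_{obs}, z)}{\sqrt{2}\,\sigma_{\ln M}(M_{obs}, z)}.    (A2)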

For the optical survey, the mass threshold of the ob- 
servable is set to M**^ = 10^^ ''h~^MQ and the redshift 
limit is z = 1, corresponding to the projected sensitiv- 
ity of the Dark Energy Survey. Different studies suggest 
a wide range of scatter for optical observables, ranging
from a constant σ_lnM = 0.5 [75] to a mass-dependent
scatter in the range 0.75 < σ_lnM < 1.2 [76]. Using weak
lensing and X-ray analysis of MaxBCG selected optical
clusters, Ref. [77] estimated a lognormal scatter of ~0.45
for P(M|M_obs), where M was determined using weak
lensing and M_obs was an optical richness estimate. We
choose a fiducial mass scatter of σ_lnM = 0.5 and allow
for a cubic evolution in redshift and mass:

    \ln M^{bias}(M_{obs}, z) = \ln M_0^{bias} + a_1 \ln(1+z) + a_2 (\ln M_{obs} - \ln M_{pivot}),    (A3)

    \sigma^2_{\ln M}(M_{obs}, z) = \sigma_0^2 + \sum_{i=1}^{3} b_i z^i + \sum_{i=1}^{3} c_i (\ln M_{obs} - \ln M_{pivot})^i.    (A4)



We set Mpivot — 10^^ h ^Mq. In all, we have 10 nuisance 
parameters for the optical mass errors (ln M_0^bias, a_1, a_2, σ_0^2, b_i, c_i).
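
As an illustration of how this parametrization can be evaluated, the sketch below codes a log-normal mass-observable relation with a bias that is linear in ln(1+z) and ln M_obs and a variance that is cubic in z and ln M_obs. The functional forms are our reading of Eqs. (A1)-(A4) and should be treated as assumptions; the pivot mass, the test mass and all function and parameter names are hypothetical.

```python
import numpy as np

def ln_mass_bias(ln_m_obs, z, ln_m_bias0, a1, a2, ln_m_pivot):
    """Bias of the mass-observable relation (assumed form of Eq. A3)."""
    return ln_m_bias0 + a1 * np.log1p(z) + a2 * (ln_m_obs - ln_m_pivot)

def var_ln_mass(ln_m_obs, z, sigma0_sq, b, c, ln_m_pivot):
    """Variance with cubic evolution in redshift and mass (assumed form of Eq. A4)."""
    d = ln_m_obs - ln_m_pivot
    var = sigma0_sq
    for i in (1, 2, 3):
        var = var + b[i - 1] * z ** i + c[i - 1] * d ** i
    return var

def p_obs_given_true(ln_m_obs, ln_m, z, p):
    """Log-normal probability density per unit ln M_obs (assumed form of Eqs. A1-A2)."""
    ln_bias = ln_mass_bias(ln_m_obs, z, p["ln_m_bias0"], p["a1"], p["a2"], p["ln_m_pivot"])
    var = var_ln_mass(ln_m_obs, z, p["sigma0_sq"], p["b"], p["c"], p["ln_m_pivot"])
    x = (ln_m_obs - ln_m - ln_bias) / np.sqrt(2.0 * var)
    return np.exp(-x ** 2) / np.sqrt(2.0 * np.pi * var)

# Fiducial values: no bias, sigma_lnM = 0.5; the pivot mass here is a placeholder.
fiducial = {"ln_m_bias0": 0.0, "a1": 0.0, "a2": 0.0, "sigma0_sq": 0.25,
            "b": (0.0, 0.0, 0.0), "c": (0.0, 0.0, 0.0), "ln_m_pivot": np.log(1.0e15)}

ln_m_true = np.log(1.0e14)  # hypothetical true mass
grid = np.linspace(ln_m_true - 6.0, ln_m_true + 6.0, 4001)
pdf = p_obs_given_true(grid, ln_m_true, 0.5, fiducial)
norm = np.sum(0.5 * (pdf[1:] + pdf[:-1]) * np.diff(grid))  # integral over ln M_obs
```

The ten entries of the parameter dictionary (counting the three b_i and three c_i separately, and excluding the pivot) correspond to the ten nuisance parameters, and the density integrates to unity in ln M_obs at the fiducial values.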

There are few, if any, constraints on the number of pa- 
rameters necessary to realistically describe the evolution 
of the variance and bias with mass. Ref. [54] shows that a 
cubic evolution of the mass-scatter with redshift captures 
most of the residual uncertainty when the redshift evolu- 
tion is completely free (as assumed in the Dark Energy 
Task Force (DETF) report [78]). While generous, this 
parametrization assumes a lognormal distribution of the
mass-observable relation that may fail for low masses (see
e.g. [73]). However, [ST] show that more complex
distributions do not degrade results substantially (~20-30%
for the test case assumed by the authors). We have also
implicitly assumed that selection effects can be described 
by the bias and scatter of the mass-observable relation. 
By the year 2016, we expect significant progress in sim- 
ulations of cluster surveys that will allow us to better 
parametrize the cluster selection errors. 



Uncertainties in the redshifts distort the volume
element. Assuming photometric techniques are used to
determine the redshifts of the clusters (hereafter
photo-z's), and a perfect angular selection, the mean number of
clusters in a photo-z bin z^p_i < z^p < z^p_{i+1} is

    \bar{m}_{\alpha,i} = \int_{z^p_i}^{z^p_{i+1}} dz^p \int dV \, \bar{n}_\alpha \, W_i^{th}(\Omega) \, p(z^p|z),    (B1)

where W_i^th(Ω) is an angular top hat window function. We
parametrize the probability of measuring a photometric
redshift, z^p, given the true cluster redshift z as [58]

    p(z^p|z) = \frac{1}{\sqrt{2\pi\sigma_z^2}} \exp[-y^2(z^p)],    (B2)

where

    y(z^p) \equiv \frac{z^p - z - z^{bias}}{\sqrt{2\sigma_z^2}},    (B3)

z^bias is the photometric redshift bias and σ_z is the scatter
in the photo-z's.
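
For a Gaussian photo-z distribution of this kind, the probability that a cluster at true redshift z is assigned to a given photo-z bin follows from the error function. The sketch below assumes that the mean of z^p is z + z^bias and that the scatter is σ_z, as in our reading of Eqs. (B2)-(B3); the function name and the example numbers are hypothetical.

```python
import math

def photoz_bin_probability(z_true, zp_lo, zp_hi, z_bias, sigma_z):
    """Probability that a cluster at z_true is assigned a photo-z in [zp_lo, zp_hi]."""
    def cdf(zp):
        # Cumulative Gaussian in z^p with mean z_true + z_bias and scatter sigma_z.
        return 0.5 * (1.0 + math.erf((zp - z_true - z_bias) / (math.sqrt(2.0) * sigma_z)))
    return cdf(zp_hi) - cdf(zp_lo)

# Bin of half-width sigma_z centred on the true redshift, with no bias.
p_unbiased = photoz_bin_probability(0.5, 0.48, 0.52, 0.0, 0.02)
# The same bin shifted by the photo-z bias captures the same fraction.
p_shifted = photoz_bin_probability(0.5, 0.49, 0.53, 0.01, 0.02)
```

A photo-z bias therefore shifts clusters between redshift bins, while the scatter controls how much of the distribution leaks out of a bin.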

On large scales, the number counts of clusters m(x)
trace the linear density perturbation δ(x),

    m_i(M_\alpha, x) \equiv m_{\alpha,i} = \bar{m}_{\alpha,i} [1 + b(M_\alpha, z)\,\delta(x)],    (B4)

where M_α denotes a bin in mass and i refers to the pixel
on the sky defined by its angular location and redshift.
The (Gaussian) halo bias may be very roughly approximated
by [50]

    b_0(M; z) = 1 + \frac{a_c \delta_c^2/\sigma^2 - 1}{\delta_c} + \frac{2 p_c}{\delta_c [1 + (a_c \delta_c^2/\sigma^2)^{p_c}]},    (B5)

with a_c = 0.75, p_c = 0.3, and δ_c = 1.69. Here σ(M, z) is
the amplitude of mass fluctuations on scale M, defined
as usual by

    \sigma^2(M, z) = \int \frac{d^3k}{(2\pi)^3} W^2(kR) P(k, z),    (B6)

where W(x) = 3 j_1(x)/x (the top-hat window), R =
(3M/4πρ_m)^{1/3}, and P(k) and ρ_m are the matter power
spectrum and energy density, respectively.
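
The Gaussian bias fit can be evaluated directly once σ(M, z) is known. The sketch below assumes the Sheth-Tormen-type form written with the parameters a_c = 0.75, p_c = 0.3 and δ_c = 1.69 quoted above, and takes σ as an input rather than computing it from a power spectrum; the function name is hypothetical.

```python
def gaussian_halo_bias(sigma, a_c=0.75, p_c=0.3, delta_c=1.69):
    """Gaussian halo bias b_0 as a function of the rms mass fluctuation sigma(M, z)."""
    nu2 = a_c * delta_c ** 2 / sigma ** 2
    return 1.0 + (nu2 - 1.0) / delta_c + 2.0 * p_c / (delta_c * (1.0 + nu2 ** p_c))

# Rarer objects (smaller sigma) are more strongly biased.
bias_rare = gaussian_halo_bias(0.5)
bias_common = gaussian_halo_bias(1.0)
```

Since cluster-scale halos correspond to small σ, this fit gives them a bias well above unity, which is one reason their large-scale correlations are strong.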

Integrating the expression above yields the average
cluster linear bias:

    b_{\alpha,i}(z) = \frac{1}{\bar{n}_{\alpha,i}(z)} \int_{M^{obs}_\alpha}^{M^{obs}_{\alpha+1}} \frac{dM^{obs}}{M^{obs}} \int \frac{dM}{M} \frac{d\bar{n}}{d\ln M} \, b(M; z) \, p(M^{obs}|M).    (B7)